*RCRA Nationwide Hedonic Study
*Distance gradient graphs
*Created: 5/21/2020
*Created by: Dennis Guignet
*Last Revised: 05/22/2023
*Last Revised by: Dennis Guignet

********************************************************************************

*This do-file takes the completed dataset of all US transactions that are 
*	within five kilometers of a TSD facility under RCRA and performs some 
*	initial analysis to inform the spatial extent of the treatment and 
*	control groups. More specifically, the results below are later used to 
*	generate Figure 3 in the main text. 

********************************************************************************
********************************************************************************

*set empty cells for factor variables to drop
set emptycells drop
clear all
*increase max variables allowed b/c factor variables
set maxvar 100000
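
*Optional guard (a sketch, not part of the original analysis): this do-file 
*	relies on the globals salesfolder, resultsfolder, and raw_resultsfolder 
*	being defined elsewhere, and on the user-written commands reghdfe and 
*	esttab/eststo (estout package). The check below stops early if any are 
*	missing. The loop macro g is illustrative.
foreach g in salesfolder resultsfolder raw_resultsfolder {
	if `"${`g'}"' == "" {
		display as error "global `g' is not defined"
		exit 198
	}
}
which reghdfe
which esttab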


*bring in dataset of just home sales WITHIN 5km OF A CORRECTIVE ACTION
use "$salesfolder\All_Sales_Final_Cleaned_CA5k", clear
count

*set key global variable groups

*house structure and local neighborhood vars
global house lnacres lnacres_miss stories stories_miss bathtot bathtot_miss lnsqft ///
	lnsqft_miss age agesq age_miss p_nbdev_2011_200 p_nbdev_2011_500 hwy500m 

*code up global variable for TSD control group counts, by 250m distance bin
local vars cntTSD
foreach v of local vars {
	global `v' `v'0_250 `v'250_500 `v'500_750 `v'750_1000 ///
		`v'1000_1250 `v'1250_1500 `v'1500_1750 `v'1750_2000 ///
		`v'2000_2250 `v'2250_2500 `v'2500_2750 `v'2750_3000 ///
		`v'3000_3250 `v'3250_3500 `v'3500_3750 `v'3750_4000 ///
		`v'4000_4250 `v'4250_4500 `v'4500_4750 `v'4750_5000 
	}	
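*Optional check (a sketch, not part of the original analysis): rebuild the 
*	twenty 250m bin names programmatically, then confirm that they match the 
*	cntTSD global and that every variable exists in the data. The local names 
*	binlist, tsd, and same are illustrative.
local binlist
forvalues lo = 0(250)4750 {
	local hi = `lo' + 250
	local binlist `binlist' cntTSD`lo'_`hi'
}
local tsd $cntTSD
*macro list comparison ignores extra whitespace from the line continuations
local same : list binlist === tsd
assert `same'
confirm variable $cntTSD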
*code up global variables for Corrective Action stage dummies
	*Note: The farthest bin (4750_5000) is omitted so that the gradients 
	*	represent price effects with respect to distance, relative to the 
	*	farthest homes. This is not technically needed for identification, b/c 
	*	the dummy categories are not mutually exclusive (i.e., a house can have 
	*	a site in several bins), but it is needed for interpretation. In other 
	*	words, to interpret the results as if it were the strict case where 
	*	each house had only one site, in one bin and stage. We want to 
	*	interpret coefficients as relative to the farthest homes in the initial 
	*	period. Then the estimates represent the post-event change among 
	*	closer homes.
local stages pre mid post 
foreach s of local stages {
	global d`s'CA d`s'CA0_250 d`s'CA250_500 d`s'CA500_750 d`s'CA750_1000 ///
		d`s'CA1000_1250 d`s'CA1250_1500 d`s'CA1500_1750 d`s'CA1750_2000 ///
		d`s'CA2000_2250 d`s'CA2250_2500 d`s'CA2500_2750 d`s'CA2750_3000 ///
		d`s'CA3000_3250 d`s'CA3250_3500 d`s'CA3500_3750 d`s'CA3750_4000 ///
		d`s'CA4000_4250 d`s'CA4250_4500 d`s'CA4500_4750 /*d`s'CA4750_5000*/
	}
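
*Optional look (a sketch, not part of the original analysis): the note above 
*	says the bin dummies are not mutually exclusive. Assuming each dummy is 
*	coded 0/1, tabulating the number of pre-stage bins switched on per sale 
*	shows how often a home has sites in more than one bin. The omitted 
*	farthest bin is not counted here.
tempvar nbins
egen `nbins' = rowtotal($dpreCA)
tabulate `nbins'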



********************************************************************************
********************************************************************************


*Hedonic Regression
set more off
cd "$resultsfolder"

*Initial model for distance gradient graph (Figure 3 in main text)
*Model w/ all three stages: Tract FE, County by Year FE, and County by Quarter 
*	FE, plus house attributes interacted with County by Year. 
reghdfe lnrprice $cntTSD $dpreCA $dmidCA $dpostCA, ///
	absorb(i.mycntyid#i.tranyr i.mycntyid#i.quarter i.mytractid ///
	i.mycntyid#i.tranyr#c.($house)) vce(cluster mycntyid)  compact poolsize(20)
eststo m1_distgrad
estimates save "$raw_resultsfolder\m1_distgrad", replace

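*reload saved estimates, then test equality of pre- and mid-stage coefficients
*	in the three nearest bins (individually first, then jointly)
estimates use "$raw_resultsfolder\m1_distgrad"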
test _b[dpreCA0_250]=_b[dmidCA0_250]
test _b[dpreCA250_500]=_b[dmidCA250_500]
test _b[dpreCA500_750]=_b[dmidCA500_750]

test (_b[dpreCA0_250]=_b[dmidCA0_250]) ///
	(_b[dpreCA250_500]=_b[dmidCA250_500]) ///
	(_b[dpreCA500_750]=_b[dmidCA500_750])

*same tests for mid- versus post-stage coefficients
test _b[dmidCA0_250]=_b[dpostCA0_250]
test _b[dmidCA250_500]=_b[dpostCA250_500]
test _b[dmidCA500_750]=_b[dpostCA500_750]

test (_b[dmidCA0_250]=_b[dpostCA0_250]) /// 
	(_b[dmidCA250_500]=_b[dpostCA250_500]) ///
	(_b[dmidCA500_750]=_b[dpostCA500_750])
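
*Optional (a sketch, not part of the original analysis): lincom reports the 
*	size and confidence interval of each between-stage difference for the 
*	three nearest bins, complementing the equality tests above. The loop 
*	macro b is illustrative.
foreach b in 0_250 250_500 500_750 {
	lincom dmidCA`b' - dpreCA`b'
	lincom dpostCA`b' - dmidCA`b'
}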


*Export results for distance gradient graph and table.
*export hedonic coefficient estimates in wide format for graphing in Excel
esttab m1_distgrad using DistGrad_forGraph_AllStages_TractFE_CntyYrIntrxs.csv, ///
	replace label plain csv compress nogaps nolines nostar b(4) ci(4) wide ///
	noparentheses keep($dpreCA $dmidCA $dpostCA)
*regression model results
esttab m1_distgrad using DistGradHedReg_AllStages_TractFE_CntyYrIntrxs.csv, ///
	replace label csv compress nogaps nolines star(* 0.10 ** 0.05 *** 0.01) ///
	b(4) se(4) scalars(ll N_g) r2 ar2
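
*Optional quick look (a sketch, not the method used for Figure 3, which is 
*	graphed in Excel): assuming the user-written coefplot command is installed 
*	(ssc install coefplot), the stored estimates can be plotted directly in 
*	Stata as a rough check on the exported coefficients. The three stages 
*	appear side by side along the axis rather than overlaid by bin.
coefplot (m1_distgrad, keep($dpreCA) label("Pre")) ///
	(m1_distgrad, keep($dmidCA) label("Mid")) ///
	(m1_distgrad, keep($dpostCA) label("Post")), ///
	vertical yline(0) xlabel(, angle(vertical))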


*END







